Method and system for speech data compression and regeneration.



A method and system for creating a compressed
data representation of a human speech
utterance which may be utilized to accurately
regenerate the human speech utterance. First,
the location and occurrence of each period of
silence, voiced sound and unvoiced sound within
the speech utterance is detected. Next, a
single representative data frame which may be
repetitively utilized to approximate each voiced
sound is iteratively determined, along with the
duration of each voiced sound. The spectral
content of each unvoiced sound, along with
variations in the amplitude thereof, is also
determined. A compressed data representation is then
created which includes encoded representations
of a duration of each period of silence, a
duration and single representative data frame
for each voiced sound and a spectral content
and amplitude variations for each unvoiced
sound. The compressed data representation
may then be utilized to regenerate the speech
utterance without substantial loss in intelligibility.
BACKGROUND OF THE INVENTION

1. Technical Field:

The present invention relates in general to methods
and systems for speech signal data manipulation
and in particular to improved methods and systems
for compressing digital data representations of human
speech utterances. Still more particularly, the
present invention relates to a method and system for 
compressing digital data representations of human 
speech utterances utilizing the repetitive nature of 
voiced sounds contained therein. 

2. Description of the Related Art: 

Modern communications and information net- 
works often require the use of digital speech, digital 
audio and digital video. Transmission, storage, con- 
ferencing and many other types of signal processing 
for information, manipulation and display utilize 
these types of data. Basic to all such applications of 
traditionally analog signals are the techniques util- 
ized to digitize those waveforms to achieve accept- 
able levels of signal quality for these applications. 

A straightforward digitization of raw analog 
speech signals is, as those skilled in the art will ap- 
preciate, very inefficient. Raw speech data is typical- 
ly sampled at anywhere from eight thousand samples 
per second to over forty-four thousand samples per 
second. Sixteen-to-eight bit companding and Adap- 
tive Delta Pulse Code Modulation (ADPCM) may be 
utilized to achieve a 4:1 reduction in data size; however,
even utilizing such a compression ratio the tremendous
volume of data required to store speech signals
makes voice-annotated mail, LAN-transmitted
speech and personal computer based telephone answering
and speaking software applications extremely
cumbersome to utilize. For example, a one
page letter containing two kilobytes of digital data 
might have attached thereto a voice message of fif- 
teen seconds duration, which may occupy 160 kilo- 
bytes of data. Multimedia applications of recorded 
speech are similarly hindered by the size of the data 
required and are typically confined to high-density 
storage media, such as CD-ROM. 
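
The storage figures quoted above can be checked with simple arithmetic. The following is a rough sketch only: the function name `raw_bytes` and the sixteen-bit sample width are assumptions made for illustration, since the text gives only the sampling rates, the 4:1 ratio and the 160 kilobyte example.

```python
def raw_bytes(seconds, samples_per_second, bits_per_sample=16):
    """Uncompressed size in bytes of a mono recording.

    The 16-bit default sample width is an assumption, not a value from the text.
    """
    return seconds * samples_per_second * bits_per_sample // 8

# The sampling-rate extremes quoted in the text, for a fifteen second message.
low = raw_bytes(15, 8_000)       # 240,000 bytes
high = raw_bytes(15, 44_000)     # 1,320,000 bytes

# The same message after the 4:1 reduction attributed to companding and ADPCM.
low_4to1 = low // 4              # 60,000 bytes
high_4to1 = high // 4            # 330,000 bytes

# The 160 kilobyte voice message is eighty times the size of the
# two kilobyte letter it is attached to.
ratio_to_letter = 160 / 2
```

Under these assumptions the 160 kilobyte example falls between the two compressed extremes, which suggests a sampling rate toward the lower end of the quoted range.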

As a consequence of the large amounts of data 
required and the desirability of utilizing speech or dig- 
ital audio within a data processing system numerous 
techniques have been proposed for compressing the 
digital data representation of speech signals. For ex- 
ample, International Business Machines Corporation 
Technical Disclosure Bulletin, July 1981, pages 1017-1018,
discloses a technique whereby compression recording
and expansion of asymmetrical speech
waves may be accomplished. As described therein,
the first cycle of each pitch period during a voiced
sound period is utilized for compression and reconstruction
of the speech. This technique is premised
upon the observation that within most pitch periods
the first one-fourth to one-fifth of the waveform is
significantly larger in amplitude than subsequent
portions of the waveform.

This first portion of the waveform is thought to
contain nearly all of the frequency components that
the remainder of the waveform contains and consequently
only a fractional portion of the waveform is
utilized for compression and reconstruction. When
an unvoiced sound is encountered during a speech
signal utilizing this technique one of two procedures
is utilized. Either the unvoiced speech is digitized
and stored in its entirety, or a single millisecond of
sound along with the length of time that the unvoiced
sound period lasts is encoded. During reconstruction
the single sampled pitch period is replicated at
decreasing levels of amplitude for a period of time equal
to the voiced sound. While this technique represents
an excellent data compression and reconstruction
method it suffers from some loss of intelligibility.

Other techniques utilize high sampling rates to
faithfully reproduce the random noise aspects of unvoiced
speech; however, these techniques require
substantial levels of data and do not take into account
the essential qualities which determine speech
intelligibility.

In view of the above, it should be apparent that a
need exists for a method and system which may be
utilized to efficiently compress speech and data and
yet permit regeneration of that data without a substantial
loss in speech intelligibility.
SUMMARY OF THE INVENTION



It is therefore one object of the present invention
to provide an improved method and system for 
speech signal data manipulation within a data proc- 
essing system. 

It is another object of the present invention to provide
an improved method and system for compressing
digital data representations of human speech
utterances within a data processing system.

It is yet another object of the present invention to
provide an improved method and system for compressing
digital data representations of human
speech utterances within a data processing system
which takes advantage of the repetitive nature of
voiced sounds within human speech.

The foregoing objects are achieved as is now described.
The method and system of the present invention
may be utilized to create a compressed data
representation of a human speech utterance which may
be utilized to accurately regenerate the human
speech utterance. First, the location and occurrence
of each period of silence, voiced sound and unvoiced
sound within the speech utterance is detected. Next,
a single representative data frame which may be
repetitively utilized to approximate each voiced sound
is iteratively determined, along with the duration of
each voiced sound. The spectral content of each
unvoiced sound, along with variations in the amplitude
thereof, is also determined. A compressed data
representation is then created which includes encoded
representations of a duration of each period of
silence, a duration and single representative data
frame for each voiced sound and a spectral content
and amplitude variations for each unvoiced sound.
The compressed data representation may then be
utilized to regenerate the speech utterance without
substantial loss in intelligibility.

The above as well as additional objects, features,
and advantages of the present invention will become 
apparent in the following detailed written description. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The novel features believed characteristic of the 
invention are set forth in the appended claims. The 
invention itself, however, as well as a preferred mode
of use, further objects and advantages thereof, will
best be understood by reference to the following detailed
description of an illustrative embodiment when
read in conjunction with the accompanying drawings, 
wherein: 

Figure 1 is a pictorial representation of a data
processing system which may be utilized to implement
the method and system of the present invention;

Figure 2 is a high level data flow diagram of the 
process of creating a compressed digital repre- 
sentation of a speech utterance in accordance 
with the method and system of the present inven- 
tion; 

Figure 3 is a pictorial representation of the process
of analyzing a voiced sound in accordance
with the method and system of the present invention;
and

Figure 4 is a high level data flow diagram of the 
process of regenerating a speech utterance in 
accordance with the method and system of the 
present invention. 

DETAILED DESCRIPTION OF PREFERRED 
EMBODIMENT 

With reference now to the figures and in particu- 
lar with reference to Figure 1, there is depicted a pic- 
torial representation of a data processing system 10 
which may be utilized to implement the method and 
system of the present invention. As illustrated, data 
processing system 10 includes a processor unit 12, 
which is coupled to a display 14 and keyboard 16, in 
a manner well known to those having ordinary skill in 
the art. Additionally, a microphone 18 is depicted and 
may be utilized to input human speech utterances for 
digitization and manipulation, in accordance with the
method and system of the present invention. Of
course, those skilled in the art will appreciate that human
speech utterances previously digitized may be
input into data processing system 10 for manipulation
in accordance with the method and system of the
present invention by storing those utterances as digital
representations within storage media, such as
within a magnetic disk.

Data processing system 10 may be implemented
utilizing any suitable computer, such as, for example,
the International Business Machines Corporation
PS/2 personal computer. Any suitable digital computer
which can manipulate digital data in a manner described
herein may be utilized to create a compressed
digital data representation of human speech, and the
regeneration of speech utterances, utilizing the
method and system of the present invention, may be
performed utilizing an add-on processor card which
includes a digital signal processor (DSP) integrated
circuit, a software application or a low-end dedicated
hardware device attached to a communications port.

Referring now to Figure 2, there is depicted a
high level data flow diagram of the process of creating
a compressed digital representation of a speech utterance,
in accordance with the method and system
of the present invention. As illustrated, a digital signal
representation of the speech utterance is coupled to
data input 20. Data input 20 is coupled to silence detector 22.
In the depicted embodiment of the present
invention silence detector 22 merely comprises a
threshold circuit which generates an output indicative
of a period of silence, if the signal at input 20 does not
exceed a predetermined level.
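
The threshold behaviour of silence detector 22 might be sketched in software as follows. This is an illustrative sketch only: the function name `silent_runs` and the `min_length` parameter are assumptions, as the text specifies only that silence is indicated while the signal does not exceed a predetermined level.

```python
def silent_runs(samples, threshold, min_length=1):
    """Return (start, length) pairs for runs whose magnitude never exceeds threshold.

    min_length is a hypothetical parameter that discards very short runs;
    the text describes only the threshold comparison itself.
    """
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            if start is None:
                start = i            # a new quiet run begins here
        elif start is not None:
            if i - start >= min_length:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the signal.
    if start is not None and len(samples) - start >= min_length:
        runs.append((start, len(samples) - start))
    return runs

signal = [0, 1, -1, 50, -60, 40, 0, 0, 1, 0, 70]
print(silent_runs(signal, threshold=2, min_length=2))  # prints [(0, 3), (6, 4)]
```

Each returned pair gives the location and duration of one period of silence, which is the information the compressed representation records for silence.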

The digitized representation of the speech signal
is also coupled to low pass filter 24. Low pass filter
24 is preferably utilized prior to applying the digitized
speech signal to pitch extractor 26 to ensure that
phase jitter among high amplitude, high frequency
components does not skew the judgement of voice
fundamental period within pitch extractor 26. The
presence of a voiced sound within the speech utterance
is then determined by coupling a threshold detector
30 to the output of pitch extractor 26 to verify the
presence of a voiced sound and to permit a coded
representation of the voiced sound to be processed,
in accordance with the method and system of the
present invention.

In a manner which will be explained in greater detail
herein, pitch extractor 26 is utilized to identify a
single representative data frame which, when utilized
repetitively, most nearly approximates a voiced
sound within a human speech utterance. This is accomplished
by analyzing the speech signal applied to
pitch extractor 26 and determining a frame width W
for this representative data frame. As will be explained
in greater detail below, this frame width W is
determined iteratively by determining the particular
frame width which results in a representative data
frame which best identifies a repeating unit within
each voiced sound. Next, the raw input speech signal
is applied to representative data frame reconstructor
28 which utilizes the width information to construct
an image of the single representative data frame
which best characterizes each voiced speech sound,
when utilized in a repetitive manner. It should be noted
that the latter technique is applied to the raw
speech signal which has not been filtered by low pass
filter 24.

The output of representative data frame recon- 
structor 28, which consists of a representative frame 
and frame width, is then applied to repeat-length ana- 
lyzer 32. Repeat-length analyzer 32 is utilized to 
process through the speech signal in a time-wise 
fashion, when enabled by the output of threshold de- 
tector 30, and to determine the number of represen- 
tative data frames which must be replicated to ade- 
quately represent each voiced sound. The output of 
repeat-length analyzer 32 then consists of the image 
of the representative data frame, the width of that 
frame and the number of those frames which are nec- 
essary to replicate the current voiced sound within 
the speech utterance. 

The residual signal output from representative 
data frame reconstructor 28 is applied to sibilant ana- 
lyzer 34. Sibilant analyzer 34 is employed whenever 
there is a substantial residual signal from the pitch 
extraction/representative data frame construction 
procedure which indicates the presence of sibilant or 
unvoiced quantities within the speech signal. The
unvoiced nature of sibilant sounds is generally characterized
as a filtered white noise signal. Sibilant analyzer
34 is utilized to characterize sibilant or unvoiced
sounds by detecting the start and stop time of such
sounds and then performing a series of Fast Fourier
transforms (FFTs), which are then averaged to analyze
the overall spectral content of the unvoiced
sound. Next, the unvoiced sound is subdivided into
multiple time slots and the average amplitude of the
signal within each time slot is summarized to derive
an amplitude envelope. Thus, the output of sibilant
analyzer 34 constitutes the spectral values of the unvoiced
sound, the duration of the unvoiced sound and
a sequence of amplitude values, which may be appended
to the output data stream to represent the
unvoiced sound.
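
The two measurements made by sibilant analyzer 34 might be sketched as shown below. This is a minimal illustration under stated assumptions: it substitutes a direct discrete Fourier transform for the FFT named in the text (the two produce the same values), it uses non-overlapping blocks, and the function names and the choice of mean absolute value as the "average amplitude" are hypothetical.

```python
import cmath
import math

def dft_magnitudes(block):
    """Magnitude of DFT bins 0..N/2, computed directly rather than by FFT."""
    n = len(block)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(block)))
            for k in range(n // 2 + 1)]

def average_spectrum(samples, block_size):
    """Average the magnitude spectra of consecutive non-overlapping blocks."""
    blocks = [samples[i:i + block_size]
              for i in range(0, len(samples) - block_size + 1, block_size)]
    if not blocks:
        raise ValueError("sound is shorter than one block")
    spectra = [dft_magnitudes(b) for b in blocks]
    return [sum(column) / len(spectra) for column in zip(*spectra)]

def amplitude_envelope(samples, slots):
    """Mean absolute amplitude within each of `slots` equal time slots.

    Trailing samples that do not fill a whole slot are ignored.
    """
    size = len(samples) // slots
    if size == 0:
        raise ValueError("more slots than samples")
    return [sum(abs(s) for s in samples[i * size:(i + 1) * size]) / size
            for i in range(slots)]

# A test tone with two cycles per eight samples concentrates in bin 2.
tone = [math.cos(2 * math.pi * 2 * t / 8) for t in range(16)]
spectrum = average_spectrum(tone, block_size=8)
envelope = amplitude_envelope([1, -1, 1, -1, 3, -3, 3, -3], slots=2)
```

Together with the duration of the sound, `spectrum` and `envelope` correspond to the spectral values and the sequence of amplitude values described above.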

The process described above results in a compression
output data stream which is created utilizing
encoded representations of the duration of each period
of silence, a duration and single representative
data frame for each voiced sound and an encoded
representation of the spectral content and amplitude
envelope representative of each unvoiced sound.
This process may be accomplished in a random data
access process; however, the data may generally be
processed in sequence, analyzing short segments of
the speech signal in sequential order. The output of
this process is an ordered list of data and instruction
codes.

Further compression may be obtained by processing
this output stream utilizing voiced store/recall
manager 38 and sibilant store/recall manager 40. For
example, voiced store/recall manager 38 may be utilized
to scan the output stream for the presence of repeating
unit images which may be temporarily catalogued
within voiced store/recall manager 38. Thereafter,
logic within voiced store/recall manager 38 may
be utilized to decide whether waveform images may
be replaced by recalling a previously transmitted waveform
and applying transformations, such as scaling
or phase shifting, to that waveform. In this manner
a limited number of waveform storage locations
which may be available at the time of decompression
may be efficiently utilized. Further, the output stream
may be processed within voiced store/recall manager
38 in any manner suitable for utilization with the decompression
data processing system by modifying
the output stream to replace the load instructions
with store, recall and transformation instructions
suitable for the decompression technique utilized.

Similarly, sibilant store/recall manager 40 may
be utilized to analyze the output data stream for recurrent
spectral data which may be stored and recalled
in a similar manner to that described above with
respect to voiced sounds. Typically, there are only
four or five different sibilant spectra for an individual
speaker, which greatly enhances the compression/decompression
effectiveness.

With reference now to Figure 3, there is depicted
a pictorial representation of the process for analyzing
a voiced sound, in accordance with the method and
system of the present invention. As depicted, a voiced
sound sample is illustrated at reference numeral 50
which includes a highly repetitive waveform 52. First,
an assumed width for a representative data frame is
selected. As depicted at reference numeral 54, when
a poor assumption for the width of the representative
data frame has been selected the waveform within
each assumed frame differs substantially. The process
proceeds by analyzing the input sample in consecutive
frames of width W, and copying each waveform
from within an assumed frame width into a sample
space. Adjacent sections of the input sample are
then averaged and, if the representative data frame
width is poorly chosen, the average of consecutive
data frames will reflect the cancellation of adjacent
samples, in the manner depicted at reference numeral 58.

Referring again to input sample 50, if a proper assumption
is selected for the width of the representative
data frame, the signal present within each frame
within the input sample will be substantially identical,
as depicted at reference numeral 56. By repeatedly
averaging the signal within each assumed data frame
the result will be a high signal content, as depicted at
block 60, indicating that a proper width for the representative
data frame has been chosen. This process
may be accomplished in a straightforward iterative
fashion. For example, sixty-four different values of
the representative data frame width may be chosen
covering one octave, from eighty-six hertz to one
hundred and seventy-two hertz. The effective resolution
then ranges from 0.6 hertz to 2.6 hertz and an
effective single representative data frame may be accurately
chosen, by stepping through each possible
frame width until such time as the averaging of signals
within each frame results in a high signal content
as depicted at reference numeral 60 within Figure 3.
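
The iterative width search might be sketched as follows. This is an illustration under stated assumptions, not the implementation of pitch extractor 26: the sampling rate of 11,025 samples per second is not stated in the text and is assumed here because it yields sixty-four widths (64 to 127 samples) across the quoted octave with step sizes close to the quoted resolution; the function names and the use of averaged-frame energy as the measure of "signal content" are likewise hypothetical.

```python
import math

SAMPLE_RATE = 11_025  # assumption: the text does not state a sampling rate

def pitch_hz(width):
    """Fundamental frequency corresponding to a frame width in samples."""
    return SAMPLE_RATE / width

candidate_widths = range(64, 128)            # sixty-four widths, one octave
step_low = pitch_hz(127) - pitch_hz(128)     # about 0.68 Hz near 86 Hz
step_high = pitch_hz(64) - pitch_hz(65)      # about 2.65 Hz near 172 Hz

def folded_energy(samples, width):
    """Average consecutive frames of `width` samples; return the energy
    per sample of the averaged frame. Misaligned frames tend to cancel."""
    count = len(samples) // width
    if count < 2:
        return 0.0
    frame = [sum(samples[k * width + i] for k in range(count)) / count
             for i in range(width)]
    return sum(v * v for v in frame) / width

def best_frame_width(samples, widths):
    """Step through each candidate width and keep the one whose averaged
    frame retains the most energy."""
    return max(widths, key=lambda w: folded_energy(samples, w))

# Synthetic voiced sound: a fundamental with an 80 sample period plus
# one phase-locked harmonic.
voiced = [math.sin(2 * math.pi * t / 80) + 0.5 * math.sin(4 * math.pi * t / 80)
          for t in range(800)]
width = best_frame_width(voiced, candidate_widths)
```

For this synthetic input the search selects a width of 80 samples. Restricting the candidates to a single octave keeps integer multiples of the true period, which would reinforce equally well, out of the search.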

Finally, referring to Figure 4, there is depicted a
high level data flow diagram of the procedure for regenerating
a speech utterance in accordance with
the method and system of the present invention. As
illustrated, the regeneration algorithm operates upon
the compressed data in a sequential manner. As the
data and instructions within the compressed digital
representation of the speech utterance are processed,
the regenerated speech may be output immediately
to a sound generator or stored as a sound data file.
The compressed digital
representation is applied at input 70 to reconstruction
command processor 72. Reconstruction command
processor 72 may be implemented utilizing data processing
system 10 (see Figure 1).

First, the reconstruction of voiced sounds will be 
described. The image of a representative data frame 
is applied to waveform accumulator 78. Waveform ac- 
cumulator 78 utilizes waveforms which may be ob- 
tained from waveform storage 82 and thereafter out- 
puts representative data frames through repeater 80. 
Waveform transformation control 76 is utilized to 
control the output of waveform accumulator 78 utiliz- 
ing instructions such as: load waveform accumulator 
with the following waveform; repeat the content of 
waveform accumulator N times; store the content of 
waveform accumulator into a designated storage location;
recall into the waveform accumulator what is
in a designated storage location; rotate the content of
waveform accumulator by N samples; scale the amplitude
of waveform accumulator contents by a factor
of S; enter zeros for N samples to recreate a period
of silence; or, copy the data input literally from line 74.
Those skilled in the art will appreciate that certain
anomalous speech signals, such as plosives, may 
simply be digitized directly without encoding and re- 
generation of those waveforms is simply accomplish- 
ed by regenerating directly from the digitized sam- 
ples. Thus, utilizing the instructions described 
above, or additional instructions or variations of 
these instructions, a voiced sound may be regener- 
ated in the manner described. 
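
The instructions listed above might be interpreted by a loop such as the following sketch. The opcode names, the tuple encoding, the number of storage locations and the direction of rotation are all assumptions made for illustration; the text lists the operations but does not define an instruction format.

```python
def regenerate(instructions, storage_slots=4):
    """Run a list of (opcode, argument) pairs and return the output samples.

    Opcode names and the tuple encoding are hypothetical.
    """
    accumulator = []
    storage = [[] for _ in range(storage_slots)]
    output = []
    for opcode, arg in instructions:
        if opcode == "load":         # load accumulator with the given waveform
            accumulator = list(arg)
        elif opcode == "repeat":     # emit the accumulator contents N times
            output.extend(accumulator * arg)
        elif opcode == "store":      # save accumulator to a storage location
            storage[arg] = list(accumulator)
        elif opcode == "recall":     # restore accumulator from a storage location
            accumulator = list(storage[arg])
        elif opcode == "rotate":     # circular shift by N samples (leftward here)
            if accumulator:
                n = arg % len(accumulator)
                accumulator = accumulator[n:] + accumulator[:n]
        elif opcode == "scale":      # scale amplitude by a factor of S
            accumulator = [s * arg for s in accumulator]
        elif opcode == "silence":    # enter zeros for N samples
            output.extend([0] * arg)
        elif opcode == "literal":    # copy directly digitized samples, e.g. plosives
            output.extend(arg)
        else:
            raise ValueError(f"unknown opcode: {opcode!r}")
    return output

program = [
    ("load", [1, 2, 3]),
    ("repeat", 2),
    ("store", 0),
    ("silence", 2),
    ("recall", 0),
    ("rotate", 1),
    ("scale", 2),
    ("repeat", 1),
    ("literal", [9, 9]),
]
print(regenerate(program))  # prints [1, 2, 3, 1, 2, 3, 0, 0, 4, 6, 2, 9, 9]
```

The second voiced segment in this example is produced from a recalled waveform with a rotation and a scale factor, rather than from a newly loaded image.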

The regeneration of unvoiced speech, such as
sibilant sounds, is accomplished utilizing a white
noise generator 86 which is coupled through an amplitude
gate 88 to a 64 point digital filter 90. Envelope
data representative of amplitude variations within the
unvoiced sound are applied to current envelope
memory 84 and utilized to vary the amplitude gate
88. Similarly, the spectral content of the unvoiced
sound is applied to inverse discrete Fourier transform
92 to derive a 64 point impulse response, utilizing
current impulse response circuit 94. This impulse response
may be created utilizing stored impulse response
data as indicated at reference numeral 96,
and the impulse response is thereafter applied as filter
coefficients to digital filter 90, resulting in an unvoiced
sound which contains substantially the same
spectral content and amplitude envelope as the original
unvoiced speech sound.
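
The signal path just described (noise generator, amplitude gate, then digital filter) might be sketched as shown below. This is a rough illustration only: the function names are hypothetical, uniform pseudo-random numbers stand in for white noise generator 86, and the spectrum is treated as zero-phase magnitudes, none of which is specified in the text.

```python
import cmath
import random

def impulse_response(magnitudes):
    """Inverse DFT of N spectral magnitudes, assuming zero phase.

    For a real response the magnitudes should be symmetric about N/2;
    only the real part is kept.
    """
    n = len(magnitudes)
    return [sum(m * cmath.exp(2j * cmath.pi * k * t / n)
                for k, m in enumerate(magnitudes)).real / n
            for t in range(n)]

def fir_filter(samples, coefficients):
    """Direct-form FIR filter: each output is a weighted sum of recent inputs."""
    out = []
    for i in range(len(samples)):
        acc = 0.0
        for j, c in enumerate(coefficients):
            if i - j >= 0:
                acc += c * samples[i - j]
        out.append(acc)
    return out

def regenerate_unvoiced(magnitudes, envelope, samples_per_slot, seed=0):
    """White noise, gated by the amplitude envelope, then spectrally shaped."""
    rng = random.Random(seed)
    gated = [rng.uniform(-1.0, 1.0) * level
             for level in envelope
             for _ in range(samples_per_slot)]
    return fir_filter(gated, impulse_response(magnitudes))

# A flat 64 point spectrum yields a unit impulse, so the filter passes
# the gated noise through unchanged.
flat = impulse_response([1.0] * 64)
sound = regenerate_unvoiced([1.0] * 64, envelope=[0.0, 1.0], samples_per_slot=8)
```

A non-flat spectrum, such as one measured from an original sibilant, would shape the noise so that the regenerated sound approximates the original spectral content.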

Instructions for accomplishing the regeneration
of unvoiced sounds within the input data may include:
load a particular impulse response; load an envelope
of length N; trigger the occurrence of a sibilant according
to the current settings; store the current impulse
response in an impulse response storage location;
or, recall the current impulse response from a
designated storage location.

Upon reference to the foregoing those skilled in
the art will appreciate that the method and system of
the present invention may be utilized to compress a
digital data representation of a speech signal and regenerate
speech from that compressed digital representation
by taking advantage of the fact that the
voiced portion of a speech signal typically consists of
a repeating waveform (the vocal fundamental frequency
and all of its phase-locked harmonics) which
remains relatively stable for the duration of several
cycles. This permits representation of each voiced
speech sound as a single image of a repeating unit,
with a repeat count. Subsequent voiced speech
sounds tend to be slight modifications of previously
voiced speech sounds and therefore, a waveform
previously communicated and regenerated at the decompression
end may be referenced and modified to
serve as a new repeating unit image. These modifications
to a previous image, which might include amplitude
scaling, frequency scaling, or phase shifting,
are much more compactly encoded than a complete
new digital waveform image.

Similarly, the unvoiced or sibilant portions of
speech are essentially random noise which has been
filtered by, at most, two different filters. By characterizing
the spectral content and the amplitude envelope
of an unvoiced speech sound the method and
system of the present invention may be utilized to
compress a digital representation of a speech signal
and regenerate that signal into speech data with very
little loss of intelligibility.
Claims

1. A method for creating a compressed data repre- 
sentation of a human speech utterance which in- 
cludes voiced sounds and unvoiced sounds, said 
method comprising the steps of: 

detecting each occurrence of a voiced sound 
within said human speech utterance; 
analyzing each detected occurrence of a voiced 
sound within said human speech utterance to de- 
termine a duration thereof and a single represen- 
tative data frame which when utilized repetitively 
most nearly approximates said voiced sound; 
detecting each occurrence of an unvoiced sound 
within said human speech utterance; 
analyzing each detected occurrence of an un- 
voiced sound within said human speech utter- 
ance to determine a spectral content thereof and 
amplitude variations therein; and 
creating a compressed data representation of 
said human speech utterance which includes an 
encoded representation of duration and a single 
representative data frame representative of each 
voiced sound and an encoded representation of 
a spectral content and amplitude variations rep- 
resentative of each unvoiced sound. 

2. The method for creating a compressed data rep- 
resentation of a human speech utterance accord- 
ing to Claim 1, wherein said human speech utter- 
ance includes periods of silence and wherein said 
method further includes the step of detecting 
each occurrence of a period of silence within said 
human speech utterance. 

3. The method for creating a compressed data rep- 
resentation of a human speech utterance accord- 
ing to Claim 2, further including the step of deter- 
mining a duration of each detected occurrence of 
a period of silence. 

4. The method for creating a compressed data rep- 
resentation of a human speech utterance accord- 
ing to Claim 3, wherein said step of creating a 
compressed data representation of said human 
speech utterance further includes the step of including
an encoded representation of said duration
of each period of silence.

5. The method for creating a compressed data rep- 
resentation of a human speech utterance accord- 
ing to Claim 1, wherein said step of analyzing 
each detected occurrence of a voiced sound 
within said human speech utterance to determine
a duration thereof and a single representative
data frame which when utilized repetitively most
nearly approximates said voiced sound comprises
the steps of determining a duration thereof,
assuming a width W for a single representative
data frame and thereafter additively accumulating
successive frames of width W of said voiced
sound for various assumed widths until successive
frames additively reinforce one another, at a
selected assumed width.

6. The method for creating a compressed data representation
of a human speech utterance according
to Claim 1, wherein said step of analyzing
each detected occurrence of an unvoiced sound
within said human speech utterance to determine
a spectral content thereof and amplitude variations
therein comprises the steps of performing
a series of Fourier transforms upon each unvoiced
sound to determine a spectral content
thereof and determining an average amplitude
during each of a plurality of time frames within
said unvoiced sound.

7. The method for creating a compressed data representation
of a human speech utterance according
to Claim 1, further including the step of regenerating
a human speech utterance utilizing said
compressed data representation.

8. A system for creating a compressed data representation
of a human speech utterance which includes
voiced sounds and unvoiced sounds, said
system comprising:

means for detecting each occurrence of a voiced
sound within said human speech utterance;
means for analyzing each detected occurrence of
a voiced sound within said human speech utterance
to determine a duration thereof and a single
representative data frame which when utilized
repetitively most nearly approximates said
voiced sound;
means for detecting each occurrence of an unvoiced
sound within said human speech utterance;
means for analyzing each detected occurrence of
an unvoiced sound within said human speech utterance
to determine a spectral content thereof
and amplitude variations therein; and
means for creating a compressed data representation
of said human speech utterance which includes
an encoded representation of duration
and a single representative data frame representative
of each voiced sound and an encoded representation
of a spectral content and amplitude
variations representative of each unvoiced
sound.

9. The system for creating a compressed data representation
of a human speech utterance according
to Claim 1, wherein said human speech utterance
includes periods of silence and wherein said
system further includes means for detecting
each occurrence of a period of silence within said
human speech utterance.



10. The system for creating a compressed data representation
of a human speech utterance according
to Claim 9, further including means for determining
a duration of each detected occurrence of
a period of silence.

11. The system for creating a compressed data representation
of a human speech utterance according
to Claim 10, wherein said means for creating
a compressed data representation of said human
speech utterance further includes means for including
an encoded representation of said duration
of said period of silence.
12. The system for creating a compressed data representation
of a human speech utterance according
to Claim 8, wherein said means for analyzing
each detected occurrence of a voiced sound
within said human speech utterance to determine
a duration thereof and a single representative
data frame which when utilized repetitively most
nearly approximates said voiced sound comprises
means for determining a duration thereof,
means for assuming a width W for a single representative
data frame and for thereafter additively
accumulating successive frames of width W of
said voiced sound for various assumed widths
until successive frames additively reinforce one
another at a selected assumed width.



13. The system for creating a compressed data representation
of a human speech utterance according
to Claim 8, wherein said means for analyzing
each detected occurrence of an unvoiced sound
within said human speech utterance to determine
a spectral content thereof and amplitude variations
therein comprises means for performing a
series of Fourier transforms upon each unvoiced
sound to determine a spectral content thereof
and means for determining an average amplitude
during each of a plurality of time frames within
said unvoiced sound.



14. The system for creating a compressed data representation
of a human speech utterance according
to Claim 8, further including means for regenerating
a human speech utterance utilizing said
compressed data representation.